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Abstract 

Background: Stochastic biochemical reaction networks are commonly modelled by the chemical master equation, 
and can be simulated as first order linear differential equations through a finite state projection. Due to the very high 
state space dimension of these equations, numerical simulations are computationally expensive. This is a particular 
problem for analysis tasks requiring repeated simulations for different parameter values. Such tasks are 
computationally expensive to the point of infeasibility with the chemical master equation. 

Results: In this article, we apply parametric model order reduction techniques in order to construct accurate 
low-dimensional parametric models of the chemical master equation. These surrogate models can be used in various 
parametric analysis task such as identifiability analysis, parameter estimation, or sensitivity analysis. As biological 
examples, we consider two models for gene regulation networks, a bistable switch and a network displaying 
stochastic oscillations. 

Conclusions: The results show that the parametric model reduction yields efficient models of stochastic biochemical 
reaction networks, and that these models can be useful for systems biology applications involving parametric analysis 
problems such as parameter exploration, optimization, estimation or sensitivity analysis. 

Keywords: Stochastic biochemical network, Model reduction, Reduced basis, Genetic regulatory network, 
Computational efficiency, Parameter estimation 



Background 

The chemical master equation (CME) is the most basic 
mathematical description of stochastic biomolecular reac- 
tion networks [1,2]. The CME is a generally infinite- 
dimensional linear differential equation. It characterizes 
the temporal development of the probabilities that the 
network is in any of its possible configurations, where the 
different configurations are characterized by the molecu- 
lar copy numbers of the networks chemical species. 

Due to its infinite dimension, the CME is usually not 
directly solvable, not even with numerical methods. A 
recent breakthrough in the numerical treatment of the 
CME was the establishment of the finite state projec- 
tion (FSP) method by Munsky and Khammash [3]. They 
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showed that it is possible to compute a good approxi- 
mation to the real solution by projecting the CME to a 
suitable finite subdomain of the networks state space, and 
solving the resulting finite-dimensional linear differential 
equation on that domain. Nevertheless, the FSP approach 
still yields very high-dimensional models which are com- 
putationally expensive to simulate, even for small bio- 
chemical networks. The efficient simulation of the CME 
is an area of active research, and recently other simulation 
methods have been developed that can also be used for 
larger networks [4,5]. 

Despite this progress, the direct simulation of the CME 
remains a computational bottleneck for common model 
analysis tasks in systems biology. It is especially problem- 
atic for tasks which require the repeated simulation of the 
model using different parameter values, for example iden- 
tifiability analysis, parameter estimation, or model sensi- 
tivity analysis. Thereby, while a single or a few evaluations 
of a CME model with the FSP or other approaches may 
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still be computationally feasible, the necessity of many 
repeated simulations will quickly render higher-level anal- 
ysis tasks infeasible. 

Mathematical methods that approximate the behaviour 
of a high-dimensional original model through a low- 
dimensional reduced model are a common way to deal 
with complex models. Especially for linear differential 
equations, model order reduction is a well established 
field and several methods to compute reduced order mod- 
els are available [6]. Note that the step of generating a 
reduced model is usually computationally more expensive 
than a single or even a few simulations of the origi- 
nal high-dimensional model. But the simulation of the 
resulting reduced models is frequently orders of magni- 
tude faster than the solution of the original model. So, 
model reduction is worth the effort if many repeated sim- 
ulations are to be expected. Unfortunately, for analysis 
tasks which require the repeated model simulation with 
different parameters, classical model reduction methods 
are not helpful. With these methods, the reduced model 
depends on specific parameter values in the original 
model, and the reduction needs to be redone for different 
parameter values. Thus, for the mentioned analysis tasks, 
the model reduction process would have to be repeated 
for each new parameter value, and no gain in computa- 
tional efficiency would typically be possible. While classi- 
cal model reduction techniques have been applied to the 
CME in the past [7], they are not so suitable for parametric 
analysis tasks. 

Fortunately, model reduction methods where parame- 
ters from the original model are retained as adjustable 
parameters also in the reduced model are now being 
developed. These methods allow to compute a reduced 
model which uses the same parameters as the original 
model, and where the reduced model can directly be 
simulated with any choice of parameter values [8-11]. 

The purpose of this paper is to introduce the application 
of these parametric model reduction methods to finite- 
state approximations of the chemical master equation, and 
to show possible usage scenarios of such an approach. 
The structure is as follows. In the following section, we 
introduce some background and notation concerning the 
modelling of chemical reaction networks and parametric 
model order reduction. We also show how the paramet- 
ric model order reduction methods can in fact be applied 
to the CME. Afterwards, we apply the reduction tech- 
nique on two reaction network models and corresponding 
parametric analysis tasks. 

Methods 

We start with some preparatory background on the chem- 
ical master equation (CME) and parametric model order 
reduction. This serves in particular to fix the notation 
used throughout the remainder of the article. Then the 



application of parametric model order reduction to the 
CME is introduced. 

The chemical master equation 

The structure of a biochemical reaction network is charac- 
terized completely by the list of involved species, denoted 
as X\,X2 . . . ,X n , and the list of reactions, denoted as 

n n 

^OijXi^^cpijXi, ; = l,...,m, (1) 

i=l i=l 

where m is the number of reactions in the network, and 
the factors Oij e No and <pij e No are the stoichiometric 
coefficients of the reactant and product species, respec- 
tively [12]. The net change in the amount of species i 
occuring through reaction j is given by 

Nij = <pij - Oij. (2) 

Reversible reactions can always be written in the form (1) 
by splitting the forward and reverse path into two separate 
irreversible reactions. 

For a stochastic network model, the variables of inter- 
est are the probabilities that the network is in any of the 
possible states which are characterized by the molecular 
copy numbers of the individual species X\,X2 . . . ,X n . We 
denote the molecular copy number of X{ by [X{\ e No. 
Then, the state variables of the stochastic model are given 
by the real numbers 

p(t,x\,X2, ...,#») =Prob([Xi] = xi, [X 2 ] = X2, 

. . . , [X n ] = x n at time t), (3) 

for Xi e No, i = 1, As a short-hand notation for (3), 
we write p(t,x), with x £ Nq. 

The transitions from one state to another are deter- 
mined by chemical reactions according to (1). The 
changes in the molecule numbers are described by the 
stoichiometric reaction vectors 

Vj = (Ny N 2J . . . N nj ) T e Z\ (4) 

To avoid needlessly complicated cases, we assume Vj ^ 
for ; 7^ k. 

The probabilities of the network being in any of the pos- 
sible states x evolve over time, and their evolution is gov- 
erned by the chemical master equation (CME) as derived 
by [1]. From a given molecular state x, one can compute 
the propensity vj that reaction j takes place according to 
the law of mass action as 

Hj(*.*> = *yn(*). (5) 

where 0 = {Oj) f fL 1 is the vector of reaction rate constants, 
which are model parameters depending on the physical 
properties of the molecules involved in the reactions. The 
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propensities are related to the probability that reaction / 
will occur in a short time interval of length dt when the 
system is in state x: 

Prob (reaction / occurs in [ t, t + dt] \ [X] = x) =vj(x, 0)dt 

+ o(dt). 
(6) 

Taking the possible transitions and the corresponding 
reaction propensities together yields the chemical master 
equation (CME), a linear differential equation where the 
variables are the probabilities that the system is in each of 
the possible molecular states x: 



dt 



p(t,x) = (vj(x—Vj,6)p(t,x— Vj) — Vj {x 1 0)p{t 1 x)) 1 

(7) 



for x £ Nq. The CME (7) is subject to an initial condition 
p(to,x) = po(x) for: x e Nq. 

Despite being linear, the CME is hard to solve numeri- 
cally. This is due to the problem that the state space is for 
most systems infinite-dimensional, since all possible states 
x £ Nq of the reaction network (1) must in general be con- 
sidered. Instead of directly solving the CME (7), a number 
of alternative approaches to study the stochastic dynam- 
ics of biochemical reaction networks have been suggested. 
The most common approach is to generate a simulated 
realization of the stochastic process described by the reac- 
tion network (1), using for example the Gillespie algorithm 
[13]. In this approach, the probabilities p{t,x) for the pos- 
sible system states are obtained from many simulated 
realizations. However, since this requires a large number 
of realizations, it is computationally expensive. 

As a more direct approach, Munsky and Khammash [3] 
have proposed the finite state projection (FSP), where the 
CME is solved on a finite subset of the state space. Here, 
this subset is denoted by Q, and is defined as 



Q = [ X ® | i = i, . . . , d] C Ng, 



(8) 



where the x® are the system states for which the probabil- 
ities are computed in the projected model. The underlying 
assumption is that the probabilities for other states will be 
very low on the time scale of interest — otherwise the FSP 
may not yield good approximations to the solution of the 
CME. In particular we assume the time interval of interest 
to be given by [0,T] for final time T > 0. The probabil- 
ities for the states x® in £2 are written in the vector P(t) 
approximating p(x,t) at the finite number of states Q: 



P{t) = (Pi(t)) i=li ... 4 * (p(t, x (iy )) el 0, l] d . 

v / i=i,.. .,d 



The equation to be solved with the FSP approximation is 
d 



dt 



P{t) =A(0)P(t) forte (0, T) 



(10) 



P(P) = P 0 , 



where A(0) e R dxd is the matrix of state transition 
propensities, and Po = (po(%®)) i= i j 1S a vector of ini- 
tial probabilities for the states in Q. The elements of the 
matrix A (0) are computed as 



Au(6) = -J2vr(* (i) ,0) 
Aij(0) = 



r=l 



Vrix®, 0) if x® = x (i) + v r , r = 1, . . . , m 
0 otherwise. 



(id 



We will frequently omit the parameter dependence of 
the solution (and other parametric quantities). Hence 
the solution P(t), as abbreviation of P(t,0), of (10) is an 
approximation to the solution p(t,x) of the orginal CME 
on the domain Q. Munsky and Khammash [3] have also 
derived an upper bound on the error between the solution 
P(t) computed via the FSP, and the solution of the original 
CME p(t,x) on Q. 

Here, we consider in addition an output vector y e MP 
defined by 

y(t) = CP(t), (12) 

with C e W xd . Examples for relevant outputs are the 
probability that the molecular copy numbers are in a cer- 
tain domain Q C Q, which is achieved by the row vector 
output matrix C defined by Q = 1 if x® e other- 
wise Q = 0, with p=l, or the expected molecular copy 
numbers, given by 



y e (t) = jr t x®P i (t), 



(13) 



i=l 



(9) 



i.e. C = . . . ,x {d) ) with p=n. 

The basic motivation for the model reduction presented 
here is that we are interested in parametric analysis of 
the model, where the model (10) has to be solved many 
times with different values for the parameters 0. Due to 
the typical high dimensions of the matrix A (0), already a 
single simulation is computationally expensive, and anal- 
ysis tasks requiring many repeated simulations are often 
computationally infeasible. Thus, the primary goal is to 
derive a reduced model which is rapidly solvable and pro- 
vides an approximation y(t) to the output y(t), potentially 
without any consideration of the original state vector P(t). 

Order reduction of parametric models 

Model order reduction of parametric problems is a very 
active research field in systems theory, engineering and 
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applied mathematics. We refer to [8,10,11] and references 
therein for more information on the topic. 

Here, we apply the reduction technique for paramet- 
ric problems presented in [9] adopted to our notation. It 
is based on two biorthogonal global projection matrices 
V, W £ R dxr with r « d and W T V = Id, where r is the 
dimension of the reduced model. The matrix V is assumed 
to span a space that approximates the system state vari- 
ation for all parameters and times. The construction of 
such matrices will be detailed in the next subsection. 

The gain of computational efficiency in repeated simu- 
lations comes from a separation of the simulation task into 
a computationally expensive "offline" phase and a com- 
putationally cheap "online" phase. In the offline phase, 
suitable projection matrices V and W are computed with- 
out fixing specific parameter values. With the projection 
matrices, a reduced model with the same free param- 
eters as the original model is computed. In the online 
phase, the reduced model is simulated with the actu- 
ally chosen parameter values, which is typically several 
orders of magnitude faster than the simulation of the orig- 
inal model. For analysis tasks with repeated simulations, 
only the online phase has to be repeated for different 
choices of the parameter values, yielding an overall gain in 
computational efficiency. 

Decomposition in parametric and non-parametric part 

The reduction technique assumes a separable parameter 
dependence of the full system matrices and the initial con- 
dition. This means, we assume that there exist a suitable 
small constant Qa e N, parameter independent com- 
ponents e M. dxd and parameter dependent scalar 
coefficient functions (0) for q = 1, . . . , Qa such that 

Qa 

A(6) = J2#A ] (°) A[q] ( 14 ) 

q=l 

and similarly for the system matrix C and initial condi- 
tion Pq. We assume that 0 e V stems from some domain 
V C M m of admissible parameters. In the next step, the 
reduced component matrices and initial conditions are 
determined by 

A¥ ] :=W T AMV, cl q] :=C [ « ] V, P [ « ] := W T P [ « ] . 

(15) 

for q = 1,...,Qa. The resulting quantities a}?\ c[ q \ and 
PqP are r-dimensional vectors or matrices and indepen- 
dent of the high dimension d. The basis computation and 
the computation of these reduced system components 
is performed once and parameter-independently in the 
offline-phase. Then, in the online-phase, for any new 



parameter 0 the reduced system matrices and the initial 
condition are assembled by 

Qa 

A r (0) = J2» [q] W)Ar ] ( 16 ) 

q=l 

and similarly for P r o(0) and C r (0). The low dimensional 
reduced system that remains to be solved is 

^-P r (t)=A r (0)P r (t) forte (0,7) 
at 

P r (0)=P r0 (0) (17) 

y{t) = C r (0)P r (t). 

From the reduced state P r (t), an approximate state for the 
full system can be reconstructed at any desired time by 
P(t) = VP r (t). Also the difference between the approx- 
imated output y(t) and the output y(t) of the original 
model can be bounded by so called error estimators. 
A-posteriori error bounds for the reduced systems as 
considered here are given in [9] . 

Basis generation 

Different methods for the computation of the projec- 
tion bases V and W exist. In systems theory, methods 
like balanced truncation, Hankel-norm approximation 
or moment matching are applied, that approximate the 
input-output behaviour of a linear time-invariant system 
[6]. The resulting reduced models can be applied for vary- 
ing input signals. Extensions to parametric problems exist, 
e.g. [8,11]. As we do not have varying inputs in the prob- 
lem studied here, we consider snapshot-based approaches 
to be more suitable. This means, the projection bases are 
constructed by solution snapshots, i.e. special solutions 
computed for selected parameter values. 

The generation of projection matrices V and W must 
be done in such a way, that they are globally well approx- 
imating the system states over the parameter and time 
domain. A possible way to achieve this is the POD- 
Greedy algorithm, which has been introduced in [14] 
and is meanwhile a standard procedure in reduced basis 
methods [15,16]. The algorithm makes use of a repeated 
proper orthogonal decomposition (POD) of trajectories 
P :[ 0, T] -> R^, which for our purposes can be denned as 

POD(P) := arg min f \ \P(t) - (y T P(t))v\ \ 2 dt. 
veR d ,\\v\\=l Jo 

(18) 

Intuitively, POD(P) e M. d is a state space vector rep- 
resenting the single dominant mode that minimizes the 
squared mean projection error. Computationally, this 
minimization task is solved by a reformulation as a suit- 
able eigenvalue problem. Consider the correlation matrix 
C = J* P(t)P(t) T dt. Then, v* = POD(P) e R d is an 
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eigenvector corresponding to the largest eigenvalue X max 
of C, i.e., Cv* = X max v*. For additional theoretical and 
computational details on POD we refer to [17,18]. We fur- 
ther require a finite subset of parameters V train C V, that 
are used in the basis generation process. As error indicator 
A(0, V) we use the projection error of the full system tra- 
jectory on the reduced space spanned by the orthonormal 
columns of V, i.e. 

A(0, V) := / \\P(t,0) - W T P(t,0)\\ 2 dt. (19) 
Jo 

The POD-Greedy procedure which is given in the pseudo- 
code below, starts with an arbitrary orthonormal initial 
basis Vm 0 £ R dxN ° and performs an incremental basis 
extension. The algorithm repeatedly identifies the cur- 
rently worst resolved parameter (a), orthogonalizes the 
corresponding full trajectory with the current reduced 
space (b), computes a POD of the error trajectory (c), and 
inserts the dominant mode into the basis (d). 
function V = POD-Greedy (7^^, V No , s to i) 

1. N:=N 0 

2. while s N := max^ ?fraif! A(0, V N ) > s to i 

(a) 0* := argmaxfleT^ A(0, V^) 

(b) £(0 := P(f,0*) - V^(V^P(f,0*)) 

(c) Viv+i := POD(E) 

(d) Vath-i :=[ Vat, vat+i] 

(e) Af:=Af + l 

3. end while 

Note that the algorithm is implemented such that the 
simulation of the full model, yielding P(t, 0) in (19), is only 
performed once for each 0 in the training set V train- 

For concluding the basis generation, we set W := V \ 
This satisfies the biorthogonality condition W T V = Id, as 
V has orthonormal columns by construction. In practice 
the time-integrals in (18) are realized by a finite sampling 
of the time interval. 

A theoretical underpinning for the POD-Greedy algo- 
rithm has recently been provided by the analysis of con- 
vergence rates [19]. This is based on the approximation- 
theoretical notion of the Kolmogorov n-width d^(F) of a 
given set T C M. d , which quantifies how well the set can 
be approximated by arbitrary Af-dimensional linear sub- 
spaces of M. d . The convergence statement for the case of 
exponential convergence then can be summarized as fol- 
lows: If the set of solutions T := {P(t,0)\t e[0, T],0 e 
V} C M. d is compact and has an exponentially decay- 
ing Kolmogorov n-width dj^iJ 7 ) < Me~ aN for some 
M, a, a > 0 and all N e N, then the error sequence 
(sn)neN generated by the POD-Greedy procedure (cf. the 
definition in Step 2. in the pseudo code) also decays with 



an exponential rate, sn < CMe~ cN ^ with suitable con- 
stants ^,c, C > 0 depending on M.a.a. Thus, if the 
set of solutions can be approximated by linear subspaces 
with an exponentially decaying error term, then the POD- 
Greedy algorithm will in fact find an approximation with 
an exponentially decaying error term, though possibly 
with suboptimal parameters in the error bound. 

Extensions of the POD-Greedy algorithm exist, e.g. 
allowing more than one mode per extension step, per- 
forming adaptive parameter and time-interval partition- 
ing, or enabling training-set adaptation [15,16,20]. 

Reduced models of the parametrized chemical master 
equation 

In this section, we describe how to apply the reduction 
method for parametrized models presented in the pre- 
vious section to FSP models for the chemical master 
equation. 

As discussed in the previous section, the first step in 
the proposed reduction method is a decomposition of 
the d-dimensional system matrix A(0) as in (14). Such 
a decomposition is possible for the case of mass action 
reaction propensities, as denned in (5), or generalized 
mass action, as recently suggested for the chemical mas- 
ter equation [21]. In this case, the length of the parameter 
vector 0 is equal to the number of reactions m, and we 
decompose A (0) into m terms as 



A(0) = 0iA [l] + ...+6 m A [n 



(20) 



Hence, concerning the notation given before, we have 
Qa = m components A^ and coefficient functions 
Oq. Each matrix A^ in this decomposition 



comes from just the transition propensities corresponding 
to reaction q, and is denned by 



4? ] =-rKv*« 



M _ 

A a - 



Y\(x J k ) Y k " if* 01 =x {i) + v q 
k=i 

0 otherwise. 



(21) 



More generally, such a decomposition is also possible if 
reaction rate propensities can be decomposed into the 
product of two terms, with the first term depending on 
parameters only, and the second term on molecule num- 
bers only. This case is for example encountered when 
the temperature-dependance of the reaction rate constant 
is relevant, and the temperature T is a variable param- 

-E A 

eter in the Arrhenius equation 0 = Ae RT . Since the 
output matrix C and the initial condition Pq are usu- 
ally not depending on parameters in this framework, a 
decomposition of C and Pq is not considered. 
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The situation is more difficult for reaction propensities 
involving for example rational terms with parameters in 
the denominator. The denominator parameters can not 
be included in the reduced order model by the decom- 
position outlined in (20) and (21). If variations in these 
parameters are however not relevant to the planned anal- 
ysis, then they can be set to their nominal value, and the 
decomposition can directly be done as described above. 
Alternatively, approximation steps can be performed, such 
as Taylor series expansion or empirical interpolation 
[22], that generate an approximating parameter-separable 
expansion. 

Results 

In this section, we present the study of two example net- 
works with the proposed model reduction method. With 
these examples, the applicability of the reduced modeling 
approach especially for analysis tasks requiring repeated 
simulations with different parameter values is illustrated. 
The first network is a bistable genetic toggle switch, where 
cells may switch randomly between two states, based on 
the model in [23] . For this network, the problem of param- 
eter estimation with a reduced model is studied. The 
second network is a second-order genetic oscillator, based 
on [24], where we perform a sensitivity analysis over a 
wide parameter range. 

Parameter estimation in a genetic toggle switch model 
Network description 

The genetic toggle switch considered here is an ovarian 
follicle switch model from [23] . It is a system of two genes 
which activate each other. The switch is modelled as a 
reaction network with two species X\, X 2 , representing 
the gene products. The network reactions are specified in 
Table 1, and the network parameters in Table 2. 

In [23], this network was shown to describe a bistable 
switch with two probability peaks, one close to x^ 0 ^ = 
(0, 0) T and the other close to x (on) = ( Vi, V 2 ) T . 

In the study [23], only the lower probability peak was 
of interest. Here, we are interested in the transition of the 
system from x^ 0 ^ to x^ on \ Therefore, the system is trun- 
cated to a rectangle Q := {0, . . . , 150} x {0, . . . , 150} such 
that x^ on \x^ 0 ^ e Q, yielding a good approximation in the 



Table 2 Parameters for the follicle switch model 



Table 1 The follicle switch model 


Reaction 


Stoichiometry vj 


Propensity v y 


Production of X] 


(1,0) T 


^ + wk ] 


Degradation of X] 


(-1,0) T 


U]X] 


Production of X 2 


(0,1 ) T 




Degradation of X 2 


(0,-1) T 


u 2 x 2 





V^ 


M^ 


U! 


v 2 


M 2 


u 2 


4 


75 


25 




75 


25 





Parameter values for the follicle switch model in Table 1 . 



finite state projection to the infinite-dimensional chemical 
master equation. 

The next step is to apply the decomposition of the 
matrix A(0) as described in the methods section. Note 
that A(0) for the switch network contains rational terms 
with the parameters M\ and M 2 . Considering these two 
parameters as fixed quantities, the truncated CME for the 
follicle switch can be written as 



P{t) = (hA [1] + V x A LZJ +wiA L ^ J + V 2 A m +u 2 A^)P(t), 

(22) 

where i = 1, . . . , 5 are of dimension 151 2 x 151 2 = 
22801 x 22801. 

As initial condition we choose a probability distributed 
over some lower states 



\[2] 



[3]_ 



I [4]. 



p(Q,x) = 



for x\ + x 2 < 20 

210 " 

0 otherwise. 



(23) 



List of reactions and reaction propensity functions for the follicle switch model 
[23]. 



For the parametric model reduction, we consider only 
variations in the parameters u\ and u 2 . These influence 
both the steady state level of gene activity in the on-state 
as well as the switching kinetics and are thus of high 
biological significance in the model. Hence we set 0 := 
(hi, u 2 ) t e[ 0.005, 0.02] 2 as the parametric domain V. As 
final time we choose T = 10 7 which corresponds to a time 
range of approximately 19 years, i.e. about three times the 
half-life time of the off-state estimated in [23]. 

Some state plots from the simulation of the full model 
are shown in Figure 1. These snapshots clearly show 
the transition of the switch from the off- state with low 
values for x\ and x 2 to the on state with high values. 
The parameter influence is mainly reflected in the speed 
of the transition: for the parameter vector (u\, u 2 ) = 
(0.005, 0.02) in the lower row, most of the probability is 
already arranged around the on-state at the end of the 
simulation time. In contrast, for the parameter vector 
(ui, u 2 ) = (0.05, 0.005) in the upper row, a signifi- 
cant portion of the probability is still located around the 
off-state at this time point. Also, the transition paths are 
different: in the first case, the values for x 2 are lower than 
the values for x\ during the transition, while in the second 
case, this relation is reversed. 

As typical simulation time for a single trajectory of the 
full system, we obtain 98.2 seconds on a IBM Lenovo 2.53 
GHz Dual Core Laptop. 
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O O 



Species X 



Figure 1 Illustration of solution snapshots of the switch model. Illustration of some solution snapshots P(t) of the switch model (22) for 
parameter vector (U] , u 2 ) = (0.05, 0.005) (upper row) and (i/i , u 2 ) = (0.005, 0.02) (lower row) at times t=0, 2 • 1 0 5 , 5 • 1 0 6 , and 1 • 1 0 7 from left to right. 



Basis generation 

We generated a reduced basis with the POD-Greedy algo- 
rithm, where the training set was chosen as the vertices of 
a mesh with 9 2 logarithmically equidistant parameter val- 
ues over the parameter domain V. We set e to i = 10 -12 
as target accuracy. We use the projection error as error 
measure, hence precompute the 81 trajectories for con- 
struction of the reduced basis. As initial basis we set No = 
1 and Vjv 0 := Po using the parameter independent initial 
condition. 

The POD-Greedy algorithm produces a basis of 33 vec- 
tors and the overall computation of the reduced basis 
takes 7.9 hours, the dominating computation time being 
spent in the error evaluations and POD computations. 
Some of the resulting orthonormal basis vectors are illus- 
trated in Figure 2. The error decay curve and the selected 



parameters in the parameter domain are illustrated in 
Figure 3. We nicely observe an exponential error decay, 
which indicates a parametric smoothness of the solution 
manifold, cf. the convergence rate statement given before 
for the POD-Greedy algorithm. The selected parameters 
seem to be located at the boundary of the parameter 
domain, indicating that the model behaviour in between 
can well be interpolated from the model behaviours along 
the boundary of the parameter domain. 

The final reduced model of dimension 33 can then be 
simulated in 0.135 seconds, corresponding to a computa- 
tional speedup factor of more than 700. 

Parameter estimation 

We exemplify a possible application of the reduced order 
model in parameter estimation, where we assume that a 









• 






• 





Figure 2 Basis vectors for the switch model. Illustration of the first eight basis vectors for the switch model generated by the POD-Greedy 
algorithm. 
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POD-Greedy error decay 



PODGreedy selected parameters 




10 15 20 25 

number of basis vectors 



Figure 3 Results of the POD-Greedy algorithm for the switch model. Illustration of the error decay during the POD-Greedy algorithm (left) 
applied to the switch model and the selected parameters (right) being a small subset of the 81 training parameter points. 



distorted output y(t) as the expected values E[x{\ is avail- 
able from population-averaged measurements. The task is 
to estimate the parameter values u\ and u<i from such a 
noisy measurement. 

The reference parameter is 0 re f = (^1,^2) = 
(0.01, 0.01) r , and, for the purpose of this example, the 
measured output is produced by simulating the original 
model with the reference parameter values and adding 5% 
relative random white noise n(t) sampled from a standard 
normal distribution, y meas (t) := y(t, 0 re f)(l-\-0.05n(t)). An 
illustration of the reference output y(t, G re f) and the noisy 
signal ymeasif) is given in the left of Figure 4. 

We want to recover the values of the parameters u\ 
and U2 based on fitting the reduced parametric models 
output y(t, 0) to the measured output y me as{t)< As is com- 
monly done in parameter estimation, we formulate a least 
squares cost function as 



jm 



-f 

Jo 



and estimate the parameters by 



9 est = argmin/(#). 



(25) 



(ymeas(t)-y(t,0)fdt, 



(24) 



In such an optimization problem, typically many for- 
ward simulations are required for adjusting y to the 
measurement. This is a particular beneficial scenario for 
reduced order models, as these simulations can be com- 
puted rapidly. 

In order to gain a deeper insight into the optimization 
problem (25), we plot the values of the error functional 
7(0) over the parameter domain (middle of Figure 4). 
Using the reduced model, the computation of the required 
21 2 = 441 trajectories is realized in less than one minute. 
This would be a significant computational effort when 
using a non-reduced model. 

From the cost function plot, we observe a narrow area of 
parameters which seem to produce a similar output as the 
reference parameter 6 re f. This shows that the two model 
parameters are not simultaneously identifiable from the 
considered output, and indicates that there may exist a 




Figure 4 Parametric analysis for the reduced switch model. Application of parametric reduced models for parametric analysis: Illustration of the 
clean and noisy signals y(t,O re f) andy meas (r), respectively (left), the optimization target J (6) over the parameter domain (middle), interactive 
parameter exploration by a graphical user interface (right). 
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functional dependence between the parameters u\ and u<i 
such that the model yields similar outputs y(t). 

Assuming a functional dependence of u\ and u<i we 
now consider the 1 -dimensional optimization problem 
along the line u<i — U2, re f = 0-01- We would like to 
recover u\ from the optimization problem. The corre- 
sponding value of the cost function is J(6 re f) = 3330.68, 
indicating a significant contribution of the noise. This 
restricted optimization problem is well conditioned and 
the optimization with a standard active set algorithm 
by MATLABs command f mi neon yields the estimated 
parameter 0 est := (ui >est , 0.01) with ui >est = 0.0100204, 
using 27 evaluations of the cost function. This accounts to 
a relative error in the u\ value of 0.204%, hence excellent 
recovery. We refrain from plotting the recovered output 
y(t, 6 es t) as it is visually indiscriminable from the output 
in the left of Figure 4. Interestingly, the optimization tar- 
get value West) = 3329.56 implies /OW) < /((9 re/ ), which 
may stem from a slight approximation error in the reduced 
model or from the effects of the measurement noise. 

The right plot in Figure 4 illustrates another applica- 
tion of reduced parametric models: We incorporated the 
model in an interactive graphical user interface in RBmat- 
lab, a matlab package for model order reduction, available 
for download at www.morepas . org. This allows inter- 
active parameter variations and instantaneous simulation 
response. 

Sensitivity analysis in a stochastic oscillator 
Network description 

The second case study is built on a genetic oscillator 
model showing stochastic resonance, which was pre- 
sented in [24]. The oscillator is based on a negative 
feedback loop between two genes with one gene having 
positive autoregulation. The oscillator is modelled as a 
reaction network with two species X\, X2, representing 
the gene products. The network reactions are specified 
in Table 3, with parameters as in Table 4. In the original 
model in [24], the dynamics were described as stochas- 
tic differential equation for the amounts of X\ and X2, 
coming from a Langevin approximation to the stochastic 
dynamics [12]. For the framework used in this paper, the 
dynamics have to be described directly by the underlying 



Table 4 Parameters for the oscillator model 



Table 3 The oscillator model 


Reaction 


Stoichiometry Vj 


Propensity Vj 


Production of Xi 


(1,0) T 


k] s 2 
k 2 s+x 2 


Degradation ofXi 


(-1,0) T 


/C3X1 


Production of Xj 


(0,1 ) T 


4 k 6 s 2 +xj 


Degradation of Xj 


(0,-1) T 


k 7 x 2 





k 2 


ks 


/C 4 


ks 


ke 


k 7 


s 


15i 


0.2 


4 


101 


1001 


6.5 


1001 


10 



Parameter values for the oscillator model from Table 3. 



CME. To achieve this, we introduce the parameter s which 
maps the dimensionless state variables from [24] to actual 
molecule numbers as required for the CME. Thus, s is also 
a measure for the networks noise level: the higher s, the 
larger the molecule number that is considered, and the 
smaller the noise level will be. 

The network model in Table 3 shows oscillations only 
in a stochastic description. The deterministic model has 
a unique asymptotically stable equilibrium point, but in 
a stochastic model, fluctuations may push the molecular 
numbers beyond a certain threshold, inducing a dynami- 
cal response along a slow manifold, which corresponds to 
one oscillatory period [24]. Depending on the noise level, 
such responses will be initiated more or less often, corre- 
sponding to a more or less regular oscillatory pattern. 

The system is truncated to the rectangle Q := 
{0, . . . , 300} x {0, . . . , 300}, which contains the relevant 
system states for the parameter ranges of interest. 

Similarly as in the switch example, the reaction propen- 
sity expressions contain rational terms in the parame- 
ters 5, 1<2, and /<6- These three cannot be decomposed 
directly, so we do the decomposition described in the 
methods section for the other five parameters only. With 
this decomposition, the truncated CME for the genetic 
oscillator can be written as 

P(t) = + k 3 A [2] + k 4 A [3] + k 5 A [4] + k 7 A [5] ^j P(t), 

(26) 

where A^ l \ i = 1, . . . , 5 are of dimension 301 2 x 301 2 = 
90601 x 90601. The initial condition for (26) is chosen 
as a uniform distribution over the rectangle {0, . . . , 50} x 
{0,...,50}: 



p(0 f x) = 



for %\ < 50, X2 < 50 



51 2 

0 otherwise. 



(27) 



List of reactions and reaction propensity functions for the oscillator model 
adopted from [24]. 



The time scale of interest for the model in (26) is for 
0 < t < T = 6. At the end of the interval, the probability 
distribution seems to approach a steady state. 

Some state plots are given in Figure 5. One observes 
a significant effect of the parameter I<4 on the amplitude 
of the oscillations. The simulation time for the detailed 
model was in average 7.3 minutes on a Dell desktop com- 
puter with 3.2 GHz dual-core Intel 4 processor and 1 GB 
RAM, without including the computation time for the 
construction of the state transition matrix A (0). 
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Basis generation 

For the basis generation, the parameter was assumed 
to vary within the interval [10, 100]. A reduced basis with 
the POD-Greedy algorithm was computed from a training 
set of 30 logarithmically equidistant parameters over the 
parameter domain ( Figure 6). As in the switch example, 
the target accuracy was chosen as e to i = 10 -12 , and the 
initial basis was chosen from the initial condition V\ := 
A). 

The POD-Greedy algorithm produces a basis of 109 
vectors, with an overall computation time of 16.5 hours 
on the hardware as in the previous subsection. The first 



20 basis vectors are shown in Figure 7. It is apparent 
that several of the basis vectors are directly included in 
order to reproduce the different amplitudes of oscilla- 
tions that will occur under variations of the parameter fcj. 
The error decay curve is shown in Figure 8, displaying an 
exponential error decay as also observed for the switch 
example. 

With the reduced basis V e R 90601x109 , we can con- 
struct a reduced parametric model for the CME of the 
oscillator as 

P r (t) = (k 4 A [ ? ] +A [ ; ] )P r (t) 

T (28) 
p r (0) = V T P(0), 

with4 3] = V T A^V e R 1 09xio9 andA M = yT + 
k 3 A [2] + k 5 A [4] +k 7 A^) V e R 109x109 . Note that since 
only I<4 has been varied in the reduction process, the other 
parameters are no longer present as parameters in the 
reduced model, but just take their nominal values. While 
the same basis V could be used to construct another 
reduced model where all parameters are retained, it is 
unlikely that this other model will be a good approxima- 
tion of the original one for varying values of the other 
parameters. 

Sensitivity analysis of the oscillation amplitude 

As an application of the reduced order parametric model 
obtained in the previous section, we study the variations of 
oscillatory amplitude over a parameter range. Specifically, 
we consider 200 equally spaced values for the parameter 
I<4 in the interval [12, 40] and compute the probability that 
the amount of X2 is larger than 100: 

Prob(* 2 > 100) = P( T > X )> ( 29 ) 




Figure 6 Parametric analysis results for the oscillator model. 

Sensitivity analysis of oscillation amplitude over a parameter interval. 
Blue line shows oscillatory amplitude over the parameter k 4 predicted 
from the reduced model. Red dots are validation results from a 
simulation of the original model. Triangles on the parameter axis 
indicate parameter values which were used in the construction of the 
reduced basis. 
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with T = 6 the final time of the simulation. The results 
are shown in Figure 6 and show a clear decay of oscillatory 
amplitude for increasing values of A4. Due to the signifi- 
cant time savings from the reduced model, this sensitivity 
curve can be computed with a high resolution. 

To evaluate the quality of the reduced model, we also 
computed the probability (29) using the original model 
(26) at two points within the considered interval for the 



parameter /cj. As shown in Figure 6, the results from the 
original model are in perfect agreement with the predic- 
tions from the reduced model at these points. Since the 
points at which the original model was evaluated in this 
experiment were not part of the training set (shown as 
triangles on the parameter axis in Figure 6), this shows 
that it is in fact possible to extrapolate the reduced model 
to parameter values that were not used to construct 
the basis. 




10 



20 40 60 80 100 

number of basis vectors 



120 



Figure 8 Results of the POD-Greedy algorithm for the oscillator 
model. Error decay curve for the oscillator model. 



Conclusions 

In this paper, we have introduced the application of para- 
metric model reduction methods to finite-state approxi- 
mations of the chemical master equation. We have also 
presented two case studies where these methods are 
applied to CME models of different networks in order to 
make parametric analysis tasks computationally efficient. 
By this, it has become clear that parametric model reduc- 
tion methods are a very useful tool for the analysis of 
stochastic biochemical reaction network described by the 
CME. 

Especially analysis tasks where many repeated simula- 
tions of a network with different parameter values are 
required can profit significantly from parametric model 
reduction. This includes for example sensitivity analysis or 
parameter optimization tasks such as identifiability anal- 
ysis or estimation. Moreover, the significant speedup of 
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the simulation for the reduced model allows an inter- 
active exploration of the networks dynamics within the 
parameter space within a suitable graphical user interface. 

This contribution is just a first step in the application 
of parametric model reduction methods to the CME. One 
particularly important aspect that we have not discussed 
here is the computation of error estimates for certifying 
that the simulation output of the reduced model is within 
some tolerance of the corresponding simulation output of 
the original model To maintain computational efficiency, 
the error estimation should be done without actually sim- 
ulating the original model. Error estimation methods have 
been developed for parametric model reduction of generic 
models [9], but tighter estimates could likely be obtained 
by taking into account the special structure of the CME 
models. Recent work for example refined the previous 
generic error bounds for stable models [25]. 
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